NSF PAR Search | NSF Public Access Repository

Unified algorithms for RL with Decision-Estimation Coefficients: PAC, reward-free, preference-based learning and beyond

https://doi.org/10.1214/24-AOS2483

Chen, Fan; Mei, Song; Bai, Yu (February 2025, The Annals of Statistics)

Free, publicly-accessible full text available February 1, 2026
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

Yin, Ming; Bai, Yu; Wang, Yu-Xiang (December 2021, Advances in Neural Information Processing Systems)

We consider the problem of offline reinforcement learning (RL), a well-motivated setting of RL that aims at policy optimization using only historical data. Despite its wide applicability, theoretical understandings of offline RL, such as its optimal sample complexity, remain largely open even in basic settings such as tabular Markov Decision Processes (MDPs). In this paper, we propose Off-Policy Double Variance Reduction (OPDVR), a new variance reduction based algorithm for offline RL. Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy. This improves over the best known upper bound by a factor of $H$. Moreover, we establish an information-theoretic lower bound of $\Omega(H^2/d_m\epsilon^2)$ which certifies that OPDVR is optimal up to logarithmic factors. Lastly, we show that OPDVR also achieves rate-optimal sample complexity under alternative settings such as the finite-horizon MDPs with non-stationary transitions and the infinite horizon MDPs with discounted rewards.
Full Text Available
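As a rough illustration of the rates quoted in the OPDVR abstract above, the sketch below evaluates the leading-order term $H^2/(d_m\epsilon^2)$ and compares it with a bound larger by a factor of $H$, which is how the abstract describes the previous best upper bound. Constants and logarithmic factors are ignored. The function name `episodes_needed` and the numeric values of `H`, `d_m`, and `eps` are hypothetical choices made for illustration, not values from the paper.

```python
def episodes_needed(horizon: int, d_m: float, epsilon: float, power: int = 2) -> float:
    """Leading-order episode count horizon**power / (d_m * epsilon**2).

    Constants and logarithmic factors are deliberately dropped, so the
    result only indicates scaling, not an actual required sample size.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    if not 0 < d_m <= 1:
        raise ValueError("d_m is a probability and must lie in (0, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return horizon ** power / (d_m * epsilon ** 2)


# Hypothetical problem parameters, chosen only for illustration.
H, d_m, eps = 20, 0.01, 0.1

# Rate stated in the abstract: H^2 / (d_m * eps^2).
opdvr = episodes_needed(H, d_m, eps, power=2)

# A bound larger by a factor of H, i.e. H^3 / (d_m * eps^2).
prior = episodes_needed(H, d_m, eps, power=3)

print(f"H^2 rate: about {opdvr:.3g} episodes")
print(f"H^3 rate: about {prior:.3g} episodes")
print(f"ratio:    about {prior / opdvr:.3g}")
```

The ratio of the two quantities equals the horizon length, so under these assumptions the saving grows linearly as episodes get longer.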
Near Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning

Yin, Ming; Bai, Yu; Wang, Yu-Xiang (April 2021, Proceedings of Machine Learning Research)
Full Text Available
An Efficient Real-Time Object Detection Framework on Resource-Constricted Hardware Devices via Software and Hardware Co-design

https://doi.org/10.1109/ASAP52443.2021.00020

Liu, Mingshuo; Luo, Shiyi; Han, Kevin; Yuan, Bo; DeMara, Ronald F.; Bai, Yu (July 2021, International Conference on Application-specific Systems, Architectures and Processors)

Full Text Available
How Important is the Train-Validation Split in Meta-Learning?

Bai, Yu; Chen, Minshuo; Zhou, Pan; Zhao, Tuo; Lee, Jason D.; Kakade, Sham; Wang, Huan; Xiong, Caiming (July 2021, International Conference on Machine Learning)
Full Text Available
An Efficient Video Prediction Recurrent Network using Focal Loss and Decomposed Tensor Train for Imbalance Dataset

https://doi.org/10.1145/3453688.3461748

Liu, Mingshuo; Han, Kevin; Luo, Shiyi; Pan, Mingze; Hossain, Mousam; Yuan, Bo; DeMara, Ronald F.; Bai, Yu (June 2021, Great Lakes Symposium on VLSI (GLSVLSI))
Full Text Available
Longitudinal dispersal properties of floating seeds within open-channel flows covered by emergent vegetation

https://doi.org/10.1016/j.advwatres.2020.103705

Liu, Xiaoguang; Zeng, Yuhong; Katul, Gabriel; Huai, Wenxin; Bai, Yu (October 2020, Advances in Water Resources)
Full Text Available
Improving STEM Education for Lower-division College Students at HSI by Utilizing Relevant Sociocultural and Academic Experiences: First-year Results from ASSURE-US Project

https://doi.org/10.18260/1-2--34795

Huang, Jidong; Kurwadkar, Sudarshan; Bein, Doina; Bai, Yu; Mayoral, Salvador (June 2020, 2020 ASEE Virtual Annual Conference Content Access)

Despite national efforts to increase the representation of minority students in STEM disciplines, disparities persist. Hispanics account for 17.4% of the U.S. population and nearly 20% of the U.S. youth population (21 years and below), yet they account for just 7% of the STEM workforce. To tackle these challenges, the National Science Foundation (NSF) has funded a 5-year project, ASSURE-US, that seeks to improve undergraduate education in Engineering and Computer Science (ECS) at California State University, Fullerton. The project seeks to advance student success during the first two years of college for ECS students. Toward that goal, the project incorporates a diverse set of socio-cultural and academic interventions. Its strategies include developing early interventions in gateway STEM courses, creating a nurturing faculty-student interaction and collaborative learning environment, providing relevant, context-based learning experiences, integrating project-based learning with engineering design in lower-division courses, exposing lower-division students to research to sustain their interest, and helping students develop career-readiness skills. The project also seeks to develop an understanding of the personal, social, cognitive, and contextual factors contributing to student persistence in STEM learning that STEM faculty can use to improve their pedagogical and student-interaction approaches. This paper summarizes the major approaches the ASSURE-US project plans to implement to reduce the achievement gap and motivate ECS students to remain in the program.
For the first-year implementation of the project, pre- and post-survey data were collected and analyzed from about one hundred freshman and sophomore ECS students regarding their academic experience in lower-division classes and their feedback on various social support events held by the ASSURE-US project during the academic year 2018-19. The preliminary results suggest that, among the ASSURE-US activities implemented in the first year, both the informal faculty-student interactions and the summer research experiences helped students commit more to their major during their lower-division years. The pre-post surveys also show improved awareness among ASSURE-US students of how to obtain academic support services, understand career options and pathways, and obtain personal counseling services.
Full Text Available
Low-Energy Acceleration of Binarized Convolutional Neural Networks using a Spin Hall Effect based Logic-in-Memory Architecture

https://doi.org/10.1109/TETC.2019.2915589

Samiee, Ashkan; Borulkar, Payal; DeMara, R. F.; Zhao, Peiyi; Bai, Yu (January 2020, IEEE Transactions on Emerging Topics in Computing)

Logic-in-Memory (LIM) architectures offer potential approaches to attaining such throughput goals within area and energy constraints starting with the lowest layers of the hardware stack. In this paper, we develop a Spintronic Logic-in-Memory (S-LIM) XNOR neural network (S-LIM XNN) which can perform binary convolution with reconfigurable in-memory logic without supplementing distinct logic circuits for computation within the memory module itself. Results indicate that the proposed S-LIM XNN designs achieve 1.2-fold energy reduction, 1.26-fold throughput increase, and 1.4-fold accuracy improvement compared to the state-of-the-art binarized convolutional neural network hardware. Design considerations, architectural approaches, and the impact of process variation on the proposed hybrid spin-CMOS design are identified and assessed, including comparisons and recommendations for future directions with respect to LIM approaches for neuromorphic computing.
Full Text Available
Compressing Deep Neural Networks Using Toeplitz Matrix: Algorithm Design and FPGA Implementation

https://doi.org/10.1109/ICASSP.2019.8683556

Liao, Siyu; Samiee, Ashkan; Deng, Chunhua; Bai, Yu; Yuan, Bo (May 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

Full Text Available
